Inter-strand distance distribution of DNA near melting

The distance distribution between complementary base pairs of the two strands of a DNA molecule is studied near the melting transition. Scaling arguments are presented for a generalized Poland-Scheraga type model which includes self-avoiding interactions. At the transition temperature and for a large distance $r$ the distribution decays as $1/r^{\kappa}$ with $\kappa = 1 + (c-2)/\nu$. Here $\nu$ is the self-avoiding walk correlation length exponent and $c$ is the exponent associated with the entropy of an open loop in the chain. Results for the distribution function just below the melting point are also presented. Numerical simulations which fully take into account the self-avoiding interactions are in good agreement with the scaling approach.

PACS numbers: 87.14.Gg, 05.70.Fh, 64.10.+h, 63.70.+h

I. INTRODUCTION

Melting or denaturation of DNA, whereby the two strands of the molecule unbind upon heating, has been a subject of interest for several decades. In experiments carried out since the sixties, calorimetric and optical melting curves have yielded information on the behavior of the order parameter (the fraction of bound complementary base pairs) near the transition [1]. This parameter gives a global measure of the average degree of opening of the molecule. With the recent advent of novel experimental techniques which allow for single-molecule manipulations, it has become possible to obtain more detailed information on the microscopic configurations of fluctuating DNA. For example, the time scale of opening and closing of loops of denaturated segments and some information about their steady-state distribution may be obtained by fluorescence correlation spectroscopy techniques [2]. Additional information is also gained by studies of the response of the molecule to stretching, unzipping and torsional forces [3, 4, 5, 6, 7].

Theoretically the melting transition has been studied within two main classes of models. The first, which we refer to as Poland-Scheraga (PS) type models [8, 9, 10, 11], considers the molecule as being composed of an alternating sequence of double-stranded segments and denaturated loops. Within the model, weights are assigned to bound segments and unbound loops, from which the nature of the transition may be deduced. In a second approach which has been employed to study the melting transition [12], the DNA is considered as a directed polymer (DP). Here the two strands are described as directed random walks which interact through a short-range attractive potential. The melting transition may then be studied using a transfer matrix method.

Within the directed polymer approach the distance distribution of complementary base pairs is readily calculable. However, realistic geometrical restrictions (such as self-avoiding interactions) are not taken into account, owing to the oversimplified directed polymer description. On the other hand, within the PS type models geometrical restrictions may be accounted for more realistically. It has recently been demonstrated [13, 14, 15] that a generalization of this model which includes the repulsive self-avoiding interactions between the various segments of the DNA chain may be analyzed. This is done using a scaling approach for general polymer networks introduced by Duplantier [16, 17]. The results for the nature of the transition and for the loop-size distribution are in very good agreement with recent numerical studies [18, 19, 20]. In the PS type models the order parameter and the loop size distribution near the transition are readily calculable. However, as defined, these models do not yield the inter-strand distance distribution. It would be interesting to generalize the scaling picture of the PS type models in order to study the inter-strand distance distribution close to the melting point.

In this Paper we study the distance distribution between complementary base pairs of the two strands within a PS approach. We derive scaling results valid both at and below the melting temperature and verify them by extensive numerical simulations of a lattice model which fully embodies excluded volume interactions.

The Paper is organized as follows: In Section II we derive a scaling picture for the inter-strand distance probability distribution both at and below the melting point. Scaling relations linking the exponents of the loop size and distance distributions are provided. The results of numerical studies of the distribution functions, which confirm the scaling picture, are given in Section III. The main results are summarized in Section IV.



II. SCALING APPROACH 

We start by briefly reviewing the main results of the PS approach. Within the framework of these models one assigns length dependent weights to both bound and unbound segments. A bound segment is energetically favored over an unbound segment, while an unbound segment (loop) is entropically favored over a bound one. A bound segment of length $l$ is assigned a weight $w^l$, where $w = \exp(-E_0/T)$, $E_0$ is the base pair binding energy, $T$ is the temperature and the Boltzmann constant $k_B$ is set to 1. Here it is assumed that only complementary base pairs interact with each other. The binding energy $E_0$ is taken to be the same for all base pairs. An unbound segment (loop) of length $l$ is assigned a weight

$$\Omega(2l) = \frac{s^l}{l^c} , \qquad (1)$$

where $s$ is a non-universal geometrical constant and $c$ is an exponent which is determined by some universal properties of the loop configurations. The nature of the melting transition depends on the value of the exponent $c$. For $c < 1$ there is no transition, for $1 < c < 2$ the transition is continuous, while for $c > 2$ the transition is first order.
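As a small illustration of the classification just stated, the following sketch maps a value of the loop exponent $c$ to the predicted order of the melting transition. The function name `transition_order` is hypothetical, and the marginal values $c = 1$ and $c = 2$ are reported separately because the strict inequalities quoted above do not assign them.

```python
def transition_order(c: float) -> str:
    """Classify the PS melting transition from the loop exponent c.

    Follows the inequalities quoted in the text; the boundary values
    c = 1 and c = 2 are labelled "marginal" (an assumption of this sketch).
    """
    if c < 1:
        return "no transition"
    if 1 < c < 2:
        return "continuous"
    if c > 2:
        return "first order"
    return "marginal"


# Illustrative values of c; 2.11 is the d = 3 estimate discussed in this section.
for c in (0.5, 1.5, 2.11):
    print(f"c = {c:.2f}: {transition_order(c)}")
```

The point of the sketch is only that the order of the transition is fixed by a single number, so that any refinement of the estimate of $c$ across the value 2 changes the prediction qualitatively.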

Early works [8, 9] have evaluated the exponent $c$ by enumerating random walks which return to the origin, yielding $c = d/2$ in $d$ dimensions. The inclusion of excluded volume interactions within a loop gives $c = d\nu$ [10], where
$\nu$ is the correlation length exponent of a self-avoiding random walk. Here the self-avoiding interactions between a loop and the rest of the chain are neglected. Both these estimates predict a continuous transition ($c < 2$) for any $d < 4$. Numerical simulations of chains of length up to 3000, where self-avoiding interactions have been fully taken into account, suggested that in fact the transition in the infinitely long chain limit is first order [18]. Recently it has been suggested [13, 14, 15] that excluded volume interactions between a loop and the rest of the chain may be taken approximately into account using results for polymer networks of arbitrary topology [16, 17]. It has been shown that for loops much smaller than the chain length the entropy of the loop has the same form as in Eq. (1), but with $c = d\nu - 2\sigma_3$. Here $\sigma_3$ is an exponent associated with an order three vertex configuration, defined and evaluated in [16]. In $d = 3$ the exponent $c$ may be estimated to be $c \approx 2.11$. Since $c > 2$ the transition is first order. Within the PS type models the weight of a loop of size $l$, $P_{\rm loop}(l)$, is given by

$$P_{\rm loop}(l) \sim \frac{e^{-l/\xi_l}}{l^c} , \qquad (2)$$

where $\xi_l \sim |T - T_M|^{-1/(c-1)}$ for $1 < c < 2$ and $\xi_l \sim |T - T_M|^{-1}$ for $c > 2$. Here $T_M$ is the melting temperature. In a recent numerical study [19] the loop size distribution at the melting transition has been evaluated for chains of length up to 200 where self-avoidance is fully taken into account. These simulations yield $c \approx 2.10(4)$, in good agreement with the theoretical estimate.
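The three estimates of $c$ discussed above can be compared numerically. The sketch below uses $\nu = 0.588$ for the self-avoiding walk in $d = 3$ (the value quoted later in this section) and inverts $c = d\nu - 2\sigma_3$ to show which value of $\sigma_3$ the quoted estimate $c \approx 2.11$ implies. The variable and function names are illustrative and not taken from the original work.

```python
D = 3             # spatial dimension
NU = 0.588        # self-avoiding walk exponent in d = 3 (value used in the text)
C_NETWORK = 2.11  # estimate quoted in the text for a loop embedded in a chain

c_random_walk = D / 2     # ideal (random walk) loop: c = d/2
c_isolated_loop = D * NU  # self-avoiding isolated loop: c = d*nu
# Inverting c = d*nu - 2*sigma_3 gives the sigma_3 implied by the quoted c.
sigma_3_implied = (D * NU - C_NETWORK) / 2


def xi_l_exponent(c: float) -> float:
    """Exponent x in xi_l ~ |T - T_M|^(-x), following the text below Eq. (2)."""
    if 1 < c < 2:
        return 1.0 / (c - 1.0)
    if c > 2:
        return 1.0
    raise ValueError("the text gives no divergence law for this value of c")


print(f"c (random walk loop)       = {c_random_walk:.3f}")
print(f"c (isolated SAW loop)      = {c_isolated_loop:.3f}")
print(f"sigma_3 implied by c=2.11  = {sigma_3_implied:.3f}")
print(f"xi_l exponent for c = 2.11 = {xi_l_exponent(C_NETWORK):.1f}")
```

Both simpler estimates fall below 2, whereas the network estimate lies above it, which is why the inclusion of the loop-chain interaction changes the predicted order of the transition.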

We now use a scaling approach to study the complementary base-pair distance distribution $P_{\rm dist}(r)$. The probability that, within a loop of size $2l$, two complementary base-pairs are separated by $\vec r$ scales as

$$P(\vec r, l) \sim \frac{1}{l^{d\nu}}\, f\!\left(\frac{r}{l^{\nu}}\right) , \qquad (3)$$

where $r = |\vec r|$ and $f$ is a scaling function. To obtain $P_{\rm dist}(r)$ we integrate over the contribution of all loops and over the angular degrees of freedom $d\omega$:

$$P_{\rm dist}(r) \sim \int_0^{\infty} dl\, P_{\rm loop}(l) \int d\omega\, r^{d-1}\, l\, P(\vec r, l) . \qquad (4)$$

Note that the contribution of each loop is $l P(\vec r, l)$, since each loop contains $l$ matching pairs and thus enters the average defining $P_{\rm dist}(r)$ with weight $l$. Inserting Eqs. (2) and (3) into Eq. (4), one finds

$$P_{\rm dist}(r) \sim r^{d-1} \int_0^{\infty} dl\, l^{\,1-c-d\nu}\, e^{-l/\xi_l}\, f\!\left(\frac{r}{l^{\nu}}\right) . \qquad (5)$$

At the transition one has $\xi_l^{-1} = 0$, and the integral scales with $r$ as

$$P_{\rm dist}(r) \sim \frac{1}{r^{\kappa}} , \qquad (6)$$

where

$$\kappa = 1 + (c-2)/\nu . \qquad (7)$$

The estimated values for the exponents, $c \approx 2.11$ and $\nu = 0.588$ in $d = 3$, yield $\kappa \approx 1.19$.
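The power counting behind the exponent $\kappa$ can be checked numerically. The sketch below evaluates the loop integral at the transition, $r^{d-1}\int dl\, l^{1-c-d\nu} f(r/l^{\nu})$, by a trapezoid rule in $\ln l$ and measures the logarithmic slope between two distances. The scaling function $f(x) = \exp(-x^{1/(1-\nu)})$, the integration cut-offs and all function names are assumptions made for illustration only; the exponent does not depend on the detailed form of $f$ as long as the integral converges.

```python
import math


def kappa(c, nu):
    """Scaling relation: decay exponent of P_dist(r) at melting."""
    return 1.0 + (c - 2.0) / nu


def p_dist_at_melting(r, c, nu, d=3, n=20000, l_min=1e-3, l_max=1e12):
    """Trapezoid estimate of r^(d-1) * int dl l^(1-c-d*nu) f(r / l^nu).

    Integration is done in u = ln l, which contributes one extra power of l.
    f(x) = exp(-x^(1/(1-nu))) is an illustrative choice, not the true f.
    """
    u0, u1 = math.log(l_min), math.log(l_max)
    du = (u1 - u0) / n
    total = 0.0
    for i in range(n + 1):
        l = math.exp(u0 + i * du)
        x = r / l ** nu
        f = math.exp(-(x ** (1.0 / (1.0 - nu))))
        g = l ** (2.0 - c - d * nu) * f
        total += g if 0 < i < n else 0.5 * g  # trapezoid end-point weights
    return r ** (d - 1) * total * du


def measured_exponent(r1, r2, c, nu):
    """Minus the log-log slope of the numerical P_dist between r1 and r2."""
    p1 = p_dist_at_melting(r1, c, nu)
    p2 = p_dist_at_melting(r2, c, nu)
    return -math.log(p2 / p1) / math.log(r2 / r1)


c, nu = 2.11, 0.588
print(f"kappa from the scaling relation: {kappa(c, nu):.3f}")
print(f"kappa from the integral:         {measured_exponent(10.0, 100.0, c, nu):.3f}")
```

The agreement of the two numbers reflects the substitution $l = r^{1/\nu} y$, under which the integral factorizes into a pure power of $r$ times an $r$-independent constant.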

Next we consider the distance distribution below the transition, where $\xi_l^{-1} > 0$. A simple scaling analysis cannot be carried out and one has to take a specific form for the function $f$. A general argument due to Fisher [21] for the end-to-end distance of a self-avoiding walk yields the following form of $f(x)$ for $x \gg 1$:

$$f(x) \sim x^{\mu} \exp\!\left(-D\, x^{\frac{1}{1-\nu}}\right) , \qquad (8)$$

where $D$ is a constant and $\mu$ is a known exponent. This argument may be generalized to consider the average distance between complementary pairs within a loop, or a loop embedded in a chain, yielding the same form but with a different exponent $\mu$ (to be discussed below). Using this form the integral (5) may be evaluated using a saddle-point approximation. This gives

$$P_{\rm dist}(r) \sim \frac{\exp(-r/\xi_r)}{r^{\eta}} \qquad (9)$$

for $r \gg \xi_r$, with

$$\eta = c - 1/2 - (1-\nu)(\mu + d) . \qquad (10)$$

The characteristic distance $\xi_r$ is related to the length $\xi_l$ by

$$\xi_r \propto \xi_l^{\nu} \quad \text{for } \xi_l \to \infty , \qquad (11)$$

so that $\xi_r \propto |T - T_M|^{-\nu/(c-1)}$ for $1 < c < 2$ and $\xi_r \propto |T - T_M|^{-\nu}$ for $c > 2$. In our case $c \approx 2.11$ and therefore we expect $\xi_r^{-1} \propto |T - T_M|^{\nu}$.
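For readers who wish to check the saddle-point step, the following is a sketch of the power counting. It assumes the large-$x$ form $f(x) \sim x^{\mu}\exp(-D x^{1/(1-\nu)})$ with a constant $D$, and drops all numerical prefactors. It is an outline under these assumptions rather than the full calculation.

```latex
% Loop integral with the assumed large-x form of f inserted:
P_{\rm dist}(r) \sim r^{\,d-1+\mu} \int dl\; l^{\,1-c-d\nu-\nu\mu}\, e^{g(l)} ,
\qquad
g(l) = -\frac{l}{\xi_l} - D\, r^{\frac{1}{1-\nu}}\, l^{-\frac{\nu}{1-\nu}} .

% Saddle point, g'(l^*) = 0:
l^* = \left(\frac{\nu D}{1-\nu}\,\xi_l\right)^{1-\nu} r \;\propto\; \xi_l^{\,1-\nu}\, r ,
\qquad
g(l^*) = -\frac{1}{\nu}\,\frac{l^*}{\xi_l} \;\propto\; -\frac{r}{\xi_l^{\,\nu}}
\quad\Longrightarrow\quad \xi_r \propto \xi_l^{\,\nu} .

% Gaussian fluctuations around l^*: |g''(l^*)| \propto 1/r at fixed \xi_l,
% which contributes a factor r^{1/2}. Collecting all powers of r (using l^* \propto r):
(d-1+\mu) + (1-c-d\nu-\nu\mu) + \tfrac{1}{2}
  = -\left[\,c - \tfrac{1}{2} - (1-\nu)(\mu+d)\,\right] = -\eta .
```

The exponential factor reproduces the relation between $\xi_r$ and $\xi_l$, while the algebraic prefactor reproduces the exponent $\eta$; the approximation is controlled when $r/\xi_r$ is large, so that the saddle is sharp.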

Within the approach introduced in [13] the exponent $\mu$ in the distribution function (8) should be evaluated by considering the average inter-strand distance in a loop embedded in a chain. Here for simplicity we adopt the approach of Fisher [21] and consider the exponent $\mu$ in the inter-strand distance distribution within an isolated loop. Thus the effect of self-avoiding interactions between the loop and the rest of the chain on the exponent $\mu$ is not taken into account. The calculation is rather lengthy and it is outlined in Appendix A. The resulting exponent is found to be

$$\mu = \frac{1}{1-\nu}\left(1/2 + 2d(\nu - 1/2) - \gamma\right) , \qquad (12)$$

where $\gamma$ is the exponent associated with the number of configurations of a random walk of length $L$, as given by $s^L L^{\gamma-1}$. Thus, for a random, non-self-avoiding loop, where $\gamma = 1$ and $\nu = 1/2$, one has $\mu = -1$ for any $d$. On the other hand, for a self-avoiding loop in $d = 3$, where $\gamma \approx 1.18$, one has $\mu = -0.37$. An estimate for $\eta$ may be obtained by using the $c$ value of an isolated loop (namely $d\nu$) in equation (10) together with the above value of $\mu$, to yield $\eta \approx 0.18$. It would be of interest to derive an expression for $\mu$ in the case of a loop embedded in a chain in order to fully take into account the effect of self-avoiding interactions.
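The numerical values quoted in this paragraph can be reproduced directly from the two exponent relations, $\mu = (1/2 + 2d(\nu - 1/2) - \gamma)/(1-\nu)$ and $\eta = c - 1/2 - (1-\nu)(\mu + d)$. The sketch below does so; the function names are illustrative, and the inputs $\nu = 0.588$ and $\gamma = 1.18$ are the $d = 3$ values used in the text.

```python
def mu_isolated_loop(nu, gamma, d):
    """Exponent mu of the scaling function for an isolated loop."""
    return (0.5 + 2.0 * d * (nu - 0.5) - gamma) / (1.0 - nu)


def eta(c, nu, mu, d):
    """Power-law exponent multiplying the exponential decay below melting."""
    return c - 0.5 - (1.0 - nu) * (mu + d)


# Random (non-self-avoiding) loop: gamma = 1, nu = 1/2, any dimension d.
for d in (1, 2, 3, 4):
    print(f"d = {d}: mu (random loop) = {mu_isolated_loop(0.5, 1.0, d):+.2f}")

# Self-avoiding isolated loop in d = 3, with c = d * nu.
d, nu, gamma = 3, 0.588, 1.18
mu = mu_isolated_loop(nu, gamma, d)
print(f"mu  (self-avoiding loop, d = 3) = {mu:+.2f}")
print(f"eta (isolated loop, c = d*nu)   = {eta(d * nu, nu, mu, d):.2f}")
```

Substituting the first relation into the second gives the compact form $\eta = c - d\nu + \gamma - 1$, so that for an isolated loop ($c = d\nu$) the estimate reduces to $\eta = \gamma - 1$, consistent with the value $0.18$ quoted above.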

It is instructive to compare these results with the distance distributions obtained within the DP approach. The exponent $c$ characterizes the number of directed walks which return to the origin for the first time. This is known to be given by [22] $c = 2 - d/2$ for $d < 2$ and $c = d/2$ for $d > 2$. In $d = 2$ there are logarithmic corrections, so that the number of configurations behaves as $s^l/(l \ln^2 l)$. Thus, one expects a continuous melting transition for $d < 4$ and a first order phase transition for $d > 4$. Clearly, the correlation length exponent satisfies $\nu = 1/2$. Using these results one obtains at criticality

$$\kappa_{\rm DP} = d - 3 \quad \text{for } d > 2 , \qquad (13)$$
$$\kappa_{\rm DP} = 1 - d \quad \text{for } d < 2 . \qquad (14)$$

Eqs. (13)-(14) are in agreement with calculations using a transfer matrix method for the DP model [23]. Below criticality our results predict that the distance distribution decays exponentially with $r$, with $\xi_r \propto |T - T_M|^{-1/|2-d|}$ for $d < 4$ and $\xi_r \propto |T - T_M|^{-1/2}$ for $d > 4$. Also, using $\mu = -1$ for the DP model one has $\eta = 0$. These results are again in agreement with known results for the DP model [23].
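The directed-polymer exponents follow from the general relations by inserting $\nu = 1/2$ and the DP value of $c$. The sketch below carries out this substitution with exact rational arithmetic; the function names are illustrative, and the marginal cases $d = 2$ (logarithmic corrections) and $d = 4$ ($c = 2$) are deliberately excluded.

```python
from fractions import Fraction

NU_DP = Fraction(1, 2)  # directed random walks


def c_dp(d):
    """Loop exponent of the DP model: 2 - d/2 for d < 2, d/2 for d > 2."""
    d = Fraction(d)
    if d == 2:
        raise ValueError("d = 2 has logarithmic corrections")
    return 2 - d / 2 if d < 2 else d / 2


def kappa_dp(d):
    """Decay exponent at criticality, kappa = 1 + (c - 2)/nu."""
    return 1 + (c_dp(d) - 2) / NU_DP


def xi_r_exponent_dp(d):
    """Exponent x in xi_r ~ |T - T_M|^(-x) below criticality."""
    c = c_dp(d)
    if 1 < c < 2:
        return NU_DP / (c - 1)
    if c > 2:
        return NU_DP
    raise ValueError("marginal or non-critical value of c")


for d in (1, Fraction(3, 2), 3, 5):
    print(f"d = {d}: c = {c_dp(d)}, kappa = {kappa_dp(d)}, "
          f"xi_r exponent = {xi_r_exponent_dp(d)}")
```

For $d < 4$ the computed divergence exponent equals $1/|2-d|$ and for $d > 4$ it equals $1/2$, in line with the expressions quoted above.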

III. NUMERICAL SIMULATIONS 

In order to test the predictions of this scaling picture, we carried out extensive numerical simulations of the loop size and inter-strand distance distributions of a model of fluctuating DNA [18, 19]. The DNA strands are represented by two self-avoiding walks of length $N$ on a cubic lattice. The numerical simulations are carried out using the pruned-enriched Rosenbluth method (PERM), a Monte Carlo method [24] which has already been employed in recent studies of DNA denaturation [19, 20]. This method generates DNA chain configurations by a growth algorithm at fixed $T$. Each configuration consists of two complementary sequences of $N$ unit steps between nearest neighbor sites of the lattice, both starting from a common origin. Self-avoidance is achieved by forbidding overlapping of sites. This constraint is relaxed only to introduce an interaction between complementary sites (with the same index along the two strands), which are allowed to overlap, with an energy gain $E_0 = -1$. In this way the total number of these contacts gives the energy gain, $-E$, and the Boltzmann weight of a DNA configuration is $\exp(E/T)$. In order to recover the equilibrium distribution in the simulation, one has to assign a suitable weight to each growth step of the chain [24, 25]. In the present work we have modified the growth rules in order to achieve better performance at lower $T$, where ordinary PERM converges more slowly. Under the usual PERM rules, long open segments, which have low weights at low $T$, are generated and are thus often pruned. This makes it difficult to generate sufficiently long and loop-rich chains by this procedure. In order to avoid this problem we have introduced a small bias for the growing ends to recombine. This bias is compensated by a suitable reweighting of the generated chain, to yield the correct equilibrium distribution. The results of this study, which are described below, corroborate the scaling picture introduced above.

FIG. 1: Log-log plots of $P_{\rm loop}(l)$ at $T_M = 0.7455$ for several chain lengths $N$. One can identify a linear region (whose range increases with $N$) with slope $-c = -2.14(4)$ (dotted line).
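To illustrate the idea of chain growth with compensating weights, the following is a minimal sketch of plain Rosenbluth sampling for a single self-avoiding walk on the cubic lattice. It is not the two-strand, biased PERM procedure used in this work: there is no pruning or enrichment, no second strand and no binding energy, and all names are hypothetical. Each growth step picks a free neighbor uniformly and multiplies the weight by the number of free neighbors, which compensates for the non-uniform sampling.

```python
import random

NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def rosenbluth_walk(n_steps, rng):
    """Grow one self-avoiding walk; return (sites, weight).

    weight is the product over steps of the number of free neighbors,
    or 0.0 if the walk gets trapped before reaching n_steps.
    """
    pos = (0, 0, 0)
    sites = [pos]
    occupied = {pos}
    weight = 1.0
    for _ in range(n_steps):
        free = []
        for dx, dy, dz in NEIGHBORS:
            cand = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
            if cand not in occupied:
                free.append(cand)
        if not free:          # trapped: this attempt contributes zero weight
            return sites, 0.0
        weight *= len(free)   # reweighting that compensates the growth bias
        pos = rng.choice(free)
        sites.append(pos)
        occupied.add(pos)
    return sites, weight


def estimate_walk_count(n_steps, n_samples, seed=0):
    """Mean Rosenbluth weight, an estimator of the number of n-step walks."""
    rng = random.Random(seed)
    total = sum(rosenbluth_walk(n_steps, rng)[1] for _ in range(n_samples))
    return total / n_samples


print(estimate_walk_count(3, 10))     # every 3-step weight is 6 * 5 * 5
print(estimate_walk_count(4, 20000))  # fluctuates around the 4-step walk count
```

In PERM, configurations whose accumulated weight becomes too small are pruned and those with large weight are copied, which keeps the weight distribution narrow for long chains; the bias towards recombination described above is an additional modification of the growth step, compensated by the same kind of reweighting.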

We start by first considering the loop size distribution at the melting temperature $T_M = 0.7455$. This distribution has been studied in the past for chains of length up to $N = 320$ monomers [19, 20]. In Figure 1 we present the results for chains of length up to $N = 1280$. We find $c = 2.14(4)$, which is in good agreement with the analytical estimate $c \approx 2.11$ and the previous numerical estimates obtained from simulations of shorter chains.

The complementary-pair distance distribution at the melting point is plotted in Figure 2 for systems of size up to $N = 1280$. We find that the decay exponent is given by $\kappa = 1.24(7)$, which is the value expected from the scaling relation (7) given the measured value of $c$. A direct estimate of $\kappa$ from the data is not easy, since the power law behavior has a cut-off at values of $r$ which are much smaller than those for the $l$ distribution. However, the algebraic decay of $P_{\rm dist}(r)$ is confirmed by the good collapse plot shown in Fig. 3.

FIG. 2: Log-log plots of $P_{\rm dist}(r)$ at $T_M$ for several chain lengths $N$. The slope $-\kappa = -1.24$, derived from Eq. (7) and plotted as a dotted line in the log-log scale, is consistent with the trend developing in the distributions of longer chains.

FIG. 3: Collapse plot of $P_{\rm dist}(r)$ according to a scaling form $P_{\rm dist}(r, N) \sim r^{-\kappa} g(r/N^{\nu})$, where $\nu \approx 0.59$.

FIG. 4: The characteristic length $\xi_l^{-1}$ as a function of $|T - T_M|$.

We now consider the distribution functions below the melting temperature and study the behavior of the length scales $\xi_l$ and $\xi_r$. Motivated by the asymptotic form (2) for the loop size distribution, we extract $\xi_l$ by fitting $\ln P_{\rm loop}(l)$ to the form

$$y_0 - l/\xi_l - c \ln l , \qquad (15)$$

where $y_0$ is a constant. This fit is carried out for several values of the temperature near $T_M$ using $c = 2.14$. For each temperature $T$, the value of $\xi_l$ is obtained at different chain lengths $N$ and is then extrapolated to the limit $N \to \infty$. The resulting temperature dependence of the extrapolated $\xi_l^{-1}$ is displayed in Figure 4. Indeed, the expected linear dependence of $\xi_l^{-1}$ on the temperature difference $|T - T_M|$ is observed near $T_M$.
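With $c$ held fixed, the fitting form $y_0 - l/\xi_l - c\ln l$ is linear in $l$ once the known term $-c \ln l$ is moved to the left-hand side, so $\xi_l$ can be extracted by an ordinary least-squares line fit. The sketch below shows this on synthetic data; the function name, the fitting window and the synthetic parameters are assumptions for illustration and are not the data or the fitting code of this work.

```python
import math


def fit_xi_l(ls, log_p, c):
    """Least-squares fit of ln P_loop(l) = y0 - l/xi_l - c ln l at fixed c.

    Returns (y0, xi_l).
    """
    # Remove the known power law: z = ln P + c ln l = y0 - l / xi_l.
    zs = [lp + c * math.log(l) for l, lp in zip(ls, log_p)]
    n = len(ls)
    mean_l = sum(ls) / n
    mean_z = sum(zs) / n
    sxx = sum((l - mean_l) ** 2 for l in ls)
    sxz = sum((l - mean_l) * (z - mean_z) for l, z in zip(ls, zs))
    slope = sxz / sxx            # equals -1 / xi_l
    y0 = mean_z - slope * mean_l
    return y0, -1.0 / slope


# Synthetic, noise-free data with known parameters (illustration only).
true_y0, true_xi, c_fixed = 0.3, 50.0, 2.14
ls = list(range(10, 200))
log_p = [true_y0 - l / true_xi - c_fixed * math.log(l) for l in ls]

y0, xi_l = fit_xi_l(ls, log_p, c_fixed)
print(f"fitted y0 = {y0:.4f}, fitted xi_l = {xi_l:.4f}")
```

In practice the fit would be repeated for each temperature and chain length $N$, and the resulting $\xi_l(N)$ extrapolated to $N \to \infty$ as described above; the extrapolation step is not included in this sketch.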

In order to obtain $\xi_r$ we carried out a similar fit of $\ln P_{\rm dist}(r)$ to the form

$$y_1 - r/\xi_r - \eta \ln r , \qquad (16)$$

where the constant $y_1$ and the exponent $\eta$ are left as free parameters. Unfortunately, the numerical estimate of $\eta$ is rather crude, yielding $0.5 < \eta < 1.2$. In Figure 5 we present a plot of $\xi_r^{-1}$ as a function of $\xi_l^{-1}$. This graph is consistent with the expected form (11).

Finally, we note that the scaling relations (7) and (10) are rather general and are not restricted to models where self-avoiding interactions are taken into account. Recently, Garel, Monthus and Orland (GMO) [26] have introduced a model for DNA denaturation where self-avoidance within each strand is neglected while mutual avoidance is included. Each strand is a simple random walk and thus $\nu = 1/2$ for this model. Numerical results obtained with the PERM method for this model in $d = 3$ dimensions yield $c = 2.55(5)$ and $\kappa = 2.1(1)$ [27]. It is readily seen that these exponents satisfy the scaling relation (7). In fact, for the GMO model one can also develop a PS type of description [27] analogous to that of Refs. [13, 14, 15], but based this time on a block copolymer network picture [20]. This description gives analytical $c$ estimates consistent with the numerical results.

FIG. 5: Parameter $\xi_r^{-1}$ as a function of $\xi_l^{-1}$, for $T < T_M$. Errors are indicated. For each $T$, we evaluate $\xi_r$ by a non-linear fit of $\ln P_{\rm dist}(r)$ of the form (16). The solid line is a fit using the form (11).
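The consistency check for the GMO model can be written out explicitly. The following worked arithmetic inserts the quoted exponents into $\kappa = 1 + (c-2)/\nu$ and propagates the quoted uncertainty of $c$, under the assumption that $\nu$ is known exactly; the same propagation is shown for the self-avoiding model with $c = 2.14(4)$ and $\nu = 0.588$.

```latex
% GMO model: c = 2.55(5), nu = 1/2
\kappa_{\rm GMO} = 1 + \frac{c-2}{\nu} = 1 + \frac{2.55 - 2}{1/2} = 2.10 ,
\qquad
\delta\kappa_{\rm GMO} = \frac{\delta c}{\nu} = \frac{0.05}{1/2} = 0.10 .

% Self-avoiding model: c = 2.14(4), nu = 0.588
\kappa = 1 + \frac{2.14 - 2}{0.588} \simeq 1.24 ,
\qquad
\delta\kappa = \frac{0.04}{0.588} \simeq 0.07 .
```

The first line reproduces the measured $\kappa = 2.1(1)$ of the GMO model, and the second reproduces the value $\kappa = 1.24(7)$ obtained above from the measured $c$.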



IV. SUMMARY 

In this paper we studied the inter-strand distance distribution for DNA at and near the melting point. A scaling analysis within Poland-Scheraga type models where self-avoiding interactions are taken into account is presented. A scaling relation is derived (Eq. (7)) between the exponents $c$ and $\kappa$ which govern the decay at melting of the probability distributions of loop lengths and of inter-strand distances, respectively. Results of extensive numerical simulations are found to be in agreement with the scaling approach.

The DNA melting transition has been studied so far either with PS type models or with the directed polymer approach. While in the latter case $\kappa$ and $c$ are easily computable, in the PS models the inter-strand distance distribution, and thus the associated exponent $\kappa$, has not yet been discussed. Our analytical and numerical results for $\kappa$ thus provide new insight into the geometry of DNA at melting, enabling one to make more quantitative comparisons between the two types of approach.

APPENDIX A: THE EXPONENT $\mu$ FOR A SELF-AVOIDING LOOP

The exponent $\mu$ may be evaluated for a self-avoiding loop by generalizing the approach of McKenzie and Moore [28], who calculated this exponent for a self-avoiding walk. This generalization closely follows the derivation in [28] and thus we will only briefly outline it here. The quantity of interest is the probability distribution for two complementary bases within a ring of length $l$. Equivalently this may be viewed as the probability that two chains which are bound at one end reach the same point $\vec r$. In this probability all possible lengths $l_1$ and $l_2$ with $l_1 + l_2 = l$ are considered.

To this end we first consider the generating function of two chains held together at one end and which are not restricted to return to the same point $\vec r$:

$$\Gamma(\vec r_1, \vec r_2, \theta) = \sum_{l_1, l_2 = 1}^{\infty} C_{l_1,l_2}\, s^{-(l_1+l_2)}\, P_{l_1,l_2}(\vec r_1, \vec r_2)\, e^{-\theta(l_1 + l_2)} . \qquad (A1)$$

Here $C_{l_1,l_2}$ is the number of configurations of two chains held together at one end, and $P_{l_1,l_2}(\vec r_1, \vec r_2)$ is the probability that the free end of one chain is at $\vec r_1$ and the free end of the other chain is at $\vec r_2$. The distribution function is given by $\Gamma(\vec r, \vec r, \theta)$, where $\theta$ is a chemical potential which controls the chain length $l_1 + l_2$ in the sum (A1).

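The Fourier transform of the Green's function (A1) of the two chains can be assumed to have the Ornstein-Zernike form at small momenta $\vec q_1$ and $\vec q_2$,

$$\Gamma(\vec q_1, \vec q_2, \theta) \sim \frac{\theta^{\nu p}}{(q_1^2 + \theta^{2\nu})(q_2^2 + \theta^{2\nu})} . \qquad (A2)$$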



Moreover, since the two chains are bound together at one end, their total number of configurations is just that of one chain of length $l_1 + l_2$. Namely, $C_{l_1,l_2} = s^l l^{\gamma-1}$, where $l = l_1 + l_2$ and $l^{\gamma-1}$ is the usual enhancement factor for a self-avoiding random walk. Using this in (A1) one can easily show that for small $\theta$

$$\Gamma(0, 0, \theta) \sim \sum_{l_1 = 1,\, l_2 = 1}^{\infty} (l_1 + l_2)^{\gamma-1}\, e^{-\theta(l_1 + l_2)} \sim \theta^{-\gamma-1} , \qquad (A3)$$

which, after comparison with (A2), implies that $p = 4 - (\gamma+1)/\nu$.

The quantity of interest is the probability

$$P_l(\vec r, \theta) = \sum_{l_1 + l_2 = l} P_{l_1,l_2}(\vec r, \vec r) . \qquad (A4)$$

This can be calculated by first inverting (A2) to obtain $\Gamma(\vec r_1, \vec r_2, \theta)$. Setting $\vec r_1 = \vec r_2 = \vec r$ yields

$$\Gamma(\vec r, \vec r, \theta) \sim \theta^{\nu(p+d-3)}\, r^{1-d}\, e^{-2\theta^{\nu} r} . \qquad (A5)$$

One then has to carry out an inverse Laplace transform of (A5) in order to extract $P_l(\vec r, \theta)$. One obtains

$$C_l\, P_l(\vec r, \theta)\, s^{-l} = \frac{1}{2\pi i} \int_{X - i\infty}^{X + i\infty} d\theta\, e^{\theta l}\, \Gamma(\vec r, \vec r, \theta) , \qquad (A6)$$

where $C_l = C_{l_1,l_2}$ with $l = l_1 + l_2$ and $X$ is larger than the real part of any singularity of $\Gamma(\vec r, \vec r, \theta)$. The result of this calculation has the expected form (8) with

$$\mu = \frac{1}{1-\nu}\left(1/2 + 2d(\nu - 1/2) - \gamma\right) . \qquad (A7)$$